A Separate-and-Learn Approach to EM Learning of PCFGs
Abstract
We propose a new approach to EM learning of PCFGs. We completely separate the process of EM learning from that of parsing, and for the former, we introduce a new EM algorithm called the graphical EM algorithm that runs on a new data structure called support graphs extracted from WFSTs (well-formed substring tables) of various parsers. Learning experiments with PCFGs using two Japanese corpora indicate that our approach can significantly outperform the existing approaches using the Inside-Outside algorithm (Baker, 1979) and Stolcke's EM algorithm (Stolcke, 1995).
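As a rough illustration of the kind of quantity that EM learning of a PCFG computes over a chart of substrings, the sketch below fills in inside probabilities for a grammar in Chomsky normal form. It is not the graphical EM algorithm and it does not build support graphs; it shows only the inside pass used by the Inside-Outside baseline. The toy grammar and the names `BINARY`, `LEXICAL`, `inside`, and `sentence_probability` are assumptions made for this example and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (an assumption, not from the paper):
#   S -> S S  with probability 0.4
#   S -> "a"  with probability 0.6
BINARY = {("S", "S", "S"): 0.4}   # (parent, left child, right child) -> probability
LEXICAL = {("S", "a"): 0.6}       # (parent, word) -> probability


def inside(words):
    """Return a table beta[(i, j, A)] = P(A derives words[i:j])."""
    n = len(words)
    beta = defaultdict(float)

    # Spans of length 1: apply lexical rules.
    for i, w in enumerate(words):
        for (parent, word), p in LEXICAL.items():
            if word == w:
                beta[(i, i + 1, parent)] += p

    # Longer spans: combine two adjacent sub-spans with a binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (parent, left, right), p in BINARY.items():
                    l = beta.get((i, k, left), 0.0)
                    r = beta.get((k, j, right), 0.0)
                    if l and r:
                        beta[(i, j, parent)] += p * l * r
    return beta


def sentence_probability(words, start="S"):
    """Total probability of all parses of `words` rooted at `start`."""
    return inside(words).get((0, len(words), start), 0.0)
```

For instance, `sentence_probability(["a", "a"])` multiplies one application of `S -> S S` by two applications of `S -> "a"`. A full EM iteration would also need outside probabilities and expected rule counts, which this sketch omits.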
Similar resources
Experiments with Spectral Learning of Latent-Variable PCFGs
Latent-variable PCFGs (L-PCFGs) are a highly successful model for natural language parsing. Recent work (Cohen et al., 2012) has introduced a spectral algorithm for parameter estimation of L-PCFGs, which—unlike the EM algorithm—is guaranteed to give consistent parameter estimates (it has PAC-style guarantees of sample complexity). This paper describes experiments using the spectral algorithm. W...
Designing an Optimal Pattern of General Medical Course Curriculum: an Effective Step in Enhancing How to Learn
Introduction: In today's world with a vast amount of information and knowledge, medical students should learn how to become effective physicians. Therefore, the competencies required for lifelong learning in the curriculum must be considered. The purpose of this study was to present a desirable general medical curriculum with emphasis on lifelong learning. Methods: The present study was Mixe...
Non-Local Modeling with a Mixture of PCFGs
While most work on parsing with PCFGs has focused on local correlations between tree configurations, we attempt to model non-local correlations using a finite mixture of PCFGs. A mixture grammar fit with the EM algorithm shows improvement over a single PCFG, both in parsing accuracy and in test data likelihood. We argue that this improvement comes from the learning of specialized grammars that ...
Parameter Learning of Logic Programs for Symbolic-Statistical Modeling
We propose a logical/mathematical framework for statistical parameter learning of parameterized logic programs, i.e. definite clause programs containing probabilistic facts with a parameterized distribution. It extends the traditional least Herbrand model semantics in logic programming to distribution semantics, possible world semantics with a probability distribution which is unconditionally a...
Parallel EM Learning for Symbolic-Statistical Models
EM learning, i.e. parameter learning for probabilistic models using the EM algorithm, requires a larger amount of time and memory space as data size increases. One way to cope with this problem is to take advantage of the power of parallel computing. In this paper, we introduce a data-parallel algorithm for EM learning applicable to the probabilistic models represented by PRISM, a programming l...
Publication date: 2001